Skip to content

feat(builder): add ephemeral HTML quick deployments#3208

Open
kilo-code-bot[bot] wants to merge 3 commits into
mainfrom
feat/html-quick-deploy
Open

feat(builder): add ephemeral HTML quick deployments#3208
kilo-code-bot[bot] wants to merge 3 commits into
mainfrom
feat/html-quick-deploy

Conversation

@kilo-code-bot
Copy link
Copy Markdown
Contributor

@kilo-code-bot kilo-code-bot Bot commented May 12, 2026

Summary

Why

Generated static sites need a lightweight publishing path that does not create a repository-backed deployment or start the sandbox build pipeline. This adds an authenticated builder boundary for raw HTML and multipart static bundles while keeping public aliases separate from private Worker identifiers and making ephemeral cleanup durable across partial external failures.

What was done

  • Add authenticated POST /deploy-html support for raw HTML and multipart static bundles with a 10 MB body limit, static-path validation, a per-user Cloudflare rate limit, and friendly adjective-noun-xxxxxxxx aliases. This PR exposes infrastructure capability only; it does not add a Kilo app UI or in-repository caller.
  • Persist quick-deploy lifecycle state in Postgres deployments_ephemeral: insert a pending row before provisioning, activate it after alias mapping, and use claim-safe five-minute cleanup sweeps for expired or failed resources.
  • Deploy each site under a private qdpl-<uuid> Worker name, reject direct private-host routing, and map the friendly alias through the dispatcher KV API with guarded forward/reverse compensation.
  • Best-effort enable the existing Made with Kilo App Builder banner for quick-deployed HTML, without failing site availability when banner setup fails.
  • Scrub ephemeral ownership and schedule immediate cleanup during GDPR soft-delete, and share slug generation and validation through @kilocode/worker-utils/deployment-slug without changing persistent deployment URL formats.

High-level architecture

sequenceDiagram
  actor Client as Quick-deploy client
  participant Builder as Builder Worker
  participant DB as Hyperdrive / Postgres
  participant CF as Cloudflare API + private qdpl Worker
  participant Dispatcher as Dispatcher API / public router
  participant KV as DEPLOY_KV

  Client->>Builder: POST /deploy-html with bearer token and static assets
  Builder->>Builder: Authenticate, rate-limit, validate upload and TTL
  Builder->>DB: Lock retained owner and insert pending lifecycle row
  Builder->>CF: Deploy assets under private qdpl UUID
  Builder->>DB: Check stored and ephemeral slug conflicts
  Builder->>Dispatcher: PUT /api/slug-mapping/{worker}
  Dispatcher->>KV: Update reciprocal alias keys with compensation
  Builder->>Dispatcher: PUT /api/app-builder-banner/{worker} (best effort)
  Dispatcher->>KV: Enable banner key
  Builder->>DB: Activate row with alias and expiry
  Builder-->>Client: Return friendly URL and expiry

  Client->>Dispatcher: GET friendly public URL
  Dispatcher->>KV: Resolve slug2worker mapping
  KV-->>Dispatcher: Return private qdpl UUID
  Dispatcher->>CF: Dispatch request to private Worker
  CF-->>Dispatcher: Return static response
  Dispatcher-->>Client: Return response, injecting banner when enabled

  opt Five-minute scheduled cleanup
    Builder->>DB: Claim bounded due rows with SKIP LOCKED
    par Independent teardown attempts
      Builder->>Dispatcher: DELETE alias mapping
    and
      Builder->>Dispatcher: DELETE banner key
    and
      Builder->>CF: Delete private Worker and assets
    end
    alt All teardown targets succeed
      Builder->>DB: Delete claimed lifecycle row
    else Any teardown target fails
      Builder->>DB: Clear claim and retry after five minutes
    end
  end
Loading

Architecture decision

Decision: Store ephemeral deployment lifecycle state in Postgres while keeping dispatcher alias routing KV-only.

Context: Quick deployments need ownership-aware cleanup, GDPR handling, retry scheduling, and recoverable claims without holding database locks across Cloudflare or dispatcher I/O. Public requests still need the existing low-latency dispatcher routing model.

Rationale: A pending Postgres row is written before external provisioning, then transitioned with compare-and-swap semantics after mapping succeeds. Scheduled workers claim bounded due rows with FOR UPDATE SKIP LOCKED, tear down alias, banner, and Worker resources independently, and retain retry state after partial failure. Direct KV alias updates stay narrow and use guarded compensation for reciprocal keys.

Consequences: Cleanup work is observable and retryable in Postgres, but alias routing remains KV-backed and eventually consistent. Alias mutations are not globally serialized, so the guarded remap and deletion compensation paths deserve focused review.

Verification

  • Submitted an authenticated Postgres-backed HTML quick deployment and opened the returned public URL.

Visual Changes

  • No Kilo app UI changes.
  • Quick-deployed HTML pages best-effort reuse the existing Made with Kilo App Builder banner; no new banner design is introduced.

Reviewer Notes

  • Reconcile the generated Postgres migration before merge: this branch currently adds 0153_nosy_psynapse, while current main already contains migrations 0153 and 0154. Regenerate migration artifacts after rebasing; do not hand-edit them.
  • Roll out in order after migration reconciliation: apply the Postgres migration, deploy dispatcher compatibility and routing changes, deploy web mapping integration, then deploy builder cleanup changes.
  • Set dispatcher BACKEND_AUTH_TOKEN, builder DISPATCHER_AUTH_TOKEN, and web USER_DEPLOYMENTS_DISPATCHER_AUTH_KEY to the same dispatcher management API credential.
  • Review shared dispatcher remap and deletion compensation carefully: ordinary persisted deployment create and rename flows use the same /api/slug-mapping/:worker seam.
  • Postgres cleanup claims at most 25 due rows per five-minute sweep, retries failed teardown after five minutes, and recovers stale claims after 20 minutes.
  • Admission is intentionally narrow: Cloudflare rate limiting remains, but this slice does not enforce an exact outstanding ephemeral-resource quota.


async function cleanupExpiredDeployments(env: Env): Promise<void> {
const now = Date.now();
const list = await env.HTML_DEPLOYMENTS_KV.list({ prefix: KV_KEY_PREFIX });
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: KV list() is not paginated — deployments beyond the first 1000 keys will never be cleaned up

Cloudflare KV's list() returns at most 1000 keys per call and sets list.list_complete = false when there are more. The cleanup loop needs to iterate until list_complete is true, using cursor to fetch subsequent pages:

let cursor: string | undefined;
do {
  const list = await env.HTML_DEPLOYMENTS_KV.list({ prefix: KV_KEY_PREFIX, cursor });
  for (const key of list.keys) {
    // ... existing cleanup logic ...
  }
  cursor = list.list_complete ? undefined : list.cursor;
} while (cursor);

Without this, once the deployment count exceeds 1000 the oldest entries are silently skipped and the Cloudflare workers they reference are never deleted.

for (const [key, value] of formData.entries()) {
if (!(value instanceof File)) continue;

const path = key;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Multipart form field keys are used directly as deploy paths with no sanitisation — path traversal sequences are accepted

key (the HTML form field name) is assigned directly to path with no validation. A caller can submit a field named ../../evil or __proto__ and it will be passed straight through to validateStaticAssets and ultimately into the deployer. At minimum, paths should be checked for .. components and leading / before use:

if (key.includes('..') || key.startsWith('/')) {
  throw new Error(`Invalid file path: "${key}"`);
}

}

const record: HtmlDeployRecord = { slug, expiresAt };
await c.env.HTML_DEPLOYMENTS_KV.put(`${KV_KEY_PREFIX}${slug}`, JSON.stringify(record));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: KV write after a successful deploy is not error-handled — a failure here orphans the deployed worker permanently

If HTML_DEPLOYMENTS_KV.put() throws (e.g. quota exceeded, transient error), the Cloudflare worker has already been deployed but no cleanup record is created. The cron will never delete it because it has no KV entry. The error should be caught, logged/reported to Sentry, and ideally the newly-deployed worker should be torn down before returning a 500.

"kv_namespaces": [
{
"binding": "HTML_DEPLOYMENTS_KV",
"id": "TODO: create with `wrangler kv namespace create HTML_DEPLOYMENTS_KV` and fill in",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: Placeholder string "TODO: create with ..." is not a valid KV namespace ID

Deploying with this config will fail at the Wrangler validation step (wrangler deploy). The KV namespace must be created and its real UUID filled in before this can be deployed to production.

Suggested change
"id": "TODO: create with `wrangler kv namespace create HTML_DEPLOYMENTS_KV` and fill in",
"id": "REPLACE_ME_WITH_REAL_KV_NAMESPACE_ID",

@kilo-code-bot
Copy link
Copy Markdown
Contributor Author

kilo-code-bot Bot commented May 12, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Executive Summary

The void ctx.blockConcurrencyWhile(...) migration race in HtmlDeployRegistry remains unresolved — all other previously flagged issues have been fixed, and the new persistence/cleanup infrastructure looks correct.

Overview

Severity Count
CRITICAL 1
WARNING 0
SUGGESTION 0

✅ Fixed in latest commits

File Issue
services/deploy-infra/builder/wrangler.jsonc Placeholder KV namespace ID removed — replaced by proper DO binding
services/deploy-infra/builder/src/index.ts KV list() pagination — resolved (KV replaced by DO)
services/deploy-infra/builder/src/index.ts Path traversal / hidden file validation — present in validator.ts
services/deploy-infra/builder/src/index.ts KV.put() unhandled after deploy — replaced with DO registry + rollback
services/deploy-infra/builder/src/html-deploy/slug.ts Deep relative cross-package path — slug generator moved to packages/worker-utils and re-exported cleanly
Issue Details (click to expand)

CRITICAL

File Line Issue
services/deploy-infra/builder/src/html-deploy/registry.ts 112 void ctx.blockConcurrencyWhile(...) discards the Promise — migrations may not complete before the first RPC call arrives, risking operations against a pre-migration database
Files Reviewed (16 files)
  • services/deploy-infra/builder/src/html-deploy/registry.ts — 1 issue
  • services/deploy-infra/builder/src/html-deploy/handler.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/ephemeral-cleanup.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/repository.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/public-slug.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/stored-slug.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/dispatcher-client.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/quota.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/cleanup.ts — 0 issues
  • services/deploy-infra/builder/src/html-deploy/slug.ts — 0 issues
  • services/deploy-infra/builder/src/index.ts — 0 issues
  • services/deploy-infra/builder/wrangler.jsonc — 0 issues
  • services/deploy-infra/dispatcher/src/routes/api.ts — 0 issues
  • apps/web/src/lib/user-deployments/deployments-service.ts — 0 issues
  • apps/web/src/lib/user-deployments/dispatcher-client.ts — 0 issues
  • packages/db/src/schema.ts — 0 issues

Fix these issues in Kilo Cloud


Reviewed by claude-4.6-sonnet-20260217 · 2,758,105 tokens

Review guidance: REVIEW.md from base branch main

@kilo-code-bot
Copy link
Copy Markdown
Contributor Author

kilo-code-bot Bot commented May 12, 2026

Review response

CRITICAL: KV namespace TODO placeholder — This requires running wrangler kv namespace create HTML_DEPLOYMENTS_KV in the target Cloudflare account before deploy. I've kept it as a placeholder since the namespace must be created in the production account. This should be done as a separate infra step before merging.

WARNING: KV write failure — Fixed in 2d9874e. If HTML_DEPLOYMENTS_KV.put() fails after a successful deploy, the worker is now rolled back (deleted) and a 500 is returned to the client. No orphaned workers.

Already fixed in earlier pushes:

  • KV pagination (cursor-based do...while loop)
  • Path traversal and dotfile rejection in multipart field names

constructor(ctx: DurableObjectState, env: Env) {
super(ctx, env);
this.db = drizzle(ctx.storage, { logger: false });
void ctx.blockConcurrencyWhile(() => migrate(this.db, migrations));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: void ctx.blockConcurrencyWhile(...) discards the Promise — migrations may not complete before the first RPC arrives

Using void here means the blockConcurrencyWhile call is fire-and-forget. Cloudflare's DO documentation requires await so that the runtime holds off all incoming requests until the callback resolves. Without await, a concurrent register() or alarm() call could execute against an uninitialized (pre-migration) database and either fail or silently corrupt state.

Suggested change
void ctx.blockConcurrencyWhile(() => migrate(this.db, migrations));
await ctx.blockConcurrencyWhile(() => migrate(this.db, migrations));

@eshurakov eshurakov changed the title feat(builder): add /deploy-html endpoint for ephemeral static site deploys feat(builder): add ephemeral HTML quick deployments May 26, 2026
@eshurakov eshurakov force-pushed the feat/html-quick-deploy branch from 25de5b1 to 77f46a4 Compare June 3, 2026 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant